AIbase

# Academic VQA

Llava Gemma 7b
LLaVA-Gemma-7b is a large multimodal model trained with the LLaVA-v1.5 framework. It pairs google/gemma-7b-it as the language backbone with a CLIP visual encoder and is suited to multimodal understanding and generation tasks.
Image-to-Text Transformers English
Intel
161
11
Llava V1.5 7b Gguf
LLaVA is an open-source multimodal chatbot built by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
Image-to-Text
granddad
13
0
Llava V1.5 13B AWQ
LLaVA is an open-source multimodal chatbot built by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data.
Image-to-Text Transformers
TheBloke
141
35
© 2025 AIbase